

'Uncanny Valley': ICE's Secret Expansion Plans, Palantir Workers' Ethical Concerns, and AI Assistants

WIRED

In this episode of Uncanny Valley, our hosts dive into WIRED's scoop about a secret Trump administration campaign extending right into your backyard. This week, hosts Brian Barrett, Leah Feiger, and Zoë Schiffer discuss WIRED's big scoop on ICE's startling plans to expand to nearly every state in the US. Plus, a WIRED writer lets the viral AI assistant OpenClaw run his life for a week to give listeners a peek at what AI agents can and can't do. ICE Is Expanding Across the US at Breakneck Speed. Write to us at uncannyvalley@wired.com. I want to continue a conversation that we started yesterday in Slack after work hours for some of us. And this is about the men's short program-- But very specifically want to pick up on the conversation where Zoë had very strong feelings about the results of men's figure skating. I feel like we need to back up because you and Leah authentically care about the Olympics so much and I think just know more about sports than I do. I deeply have never engaged with sports ever, just as a whole rule, as a category. It doesn't exist in my life. Say the lines, say the lines, Zoë, or I'm going to read them verbatim from Slack. Wait, I don't even know what you're talking about. I was merely surprised when I watched because the Americans went, I thought, wow, that guy basically fell over and was clumping around the ice, and then Japan went, and they were sailing around like little swans, and then when the gold medal came, it went to the Americans. I couldn't believe what had happened. No one else seemed outraged. For a little backup for our non-ice skating Olympic fans, I was always referring to Ilia Malinin, who a number of publications and sports experts say might actually be one of the greatest figure skaters of all time.


A survey of using EHR as real-world evidence for discovering and validating new drug indications

Talukdar, Nabasmita, Zhang, Xiaodan, Paithankar, Shreya, Wang, Hui, Chen, Bin

arXiv.org Artificial Intelligence

Electronic Health Records (EHRs) have been increasingly used as real-world evidence (RWE) to support the discovery and validation of new drug indications. This paper surveys current approaches to EHR-based drug repurposing, covering data sources, processing methodologies, and representation techniques. It discusses study designs and statistical frameworks for evaluating drug efficacy. Key challenges in validation are discussed, with emphasis on the role of large language models (LLMs) and target trial emulation. By synthesizing recent developments and methodological advances, this work provides a foundational resource for researchers aiming to translate real-world data into actionable drug-repurposing evidence.


Unlocking the Potential of Global Human Expertise

Neural Information Processing Systems

For example, in the Pandemic Response Challenge experiment, the context consisted of data about the geographic region for which the predictions were made, e.g., historical data of COVID-19 cases and intervention policies; actions were future schedules of intervention policies for the region; and outcomes were predicted future cases of COVID-19 along with the stringency


An Illusion of Progress? Assessing the Current State of Web Agents

Xue, Tianci, Qi, Weijian, Shi, Tianneng, Song, Chan Hee, Gou, Boyu, Song, Dawn, Sun, Huan, Su, Yu

arXiv.org Artificial Intelligence

As digitalization and cloud technologies evolve, the web is becoming increasingly important in modern society. Autonomous web agents based on large language models (LLMs) hold great potential for work automation. It is therefore important to accurately measure and monitor the progression of their capabilities. In this work, we conduct a comprehensive and rigorous assessment of the current state of web agents. Our results depict a very different picture of the competency of current agents, suggesting over-optimism in previously reported results. This gap can be attributed to shortcomings in existing benchmarks. We introduce Online-Mind2Web, an online evaluation benchmark consisting of 300 diverse and realistic tasks spanning 136 websites. It enables us to evaluate web agents under a setting that approximates how real users use these agents. To facilitate more scalable evaluation and development, we also develop a novel LLM-as-a-Judge automatic evaluation method and show that it can achieve around 85% agreement with human judgment, substantially higher than existing methods. Finally, we present the first comprehensive comparative analysis of current web agents, highlighting both their strengths and limitations to inspire future research.


NIRVANA: Structured pruning reimagined for large language models compression

Ai, Mengting, Wei, Tianxin, Chen, Sirui, He, Jingrui

arXiv.org Artificial Intelligence

To address these critical shortcomings, we introduce NIRVANA, a novel pruning method explicitly designed to balance immediate zero-shot accuracy preservation with robust fine-tuning capability. Transformer-based (Vaswani et al., 2017) large language models (LLMs) have revolutionized natural ... To alleviate this critical bottleneck, model compression techniques, particularly pruning (LeCun et al., 1989), emerge as an essential strategy, aiming to create lighter, more accessible models ... These two can also be applied for semi-structured pruning. This oversight often results in suboptimal pruning choices, impairing model performance. To address these critical gaps, we introduce NIRVANA (NTK-InfoRmed adaptiVe neuron & AttentioN heAd pruning), a novel structured pruning method that tightly integrates pruning decisions with model fine-tuning dynamics through the lens of the Neural Tangent Kernel (NTK) (Jacot et al., 2018). An adaptive sparsity allocation strategy that dynamically adjusts pruning ratios across layers and modules, explicitly addressing overlooked disparities in existing pruning methodologies. Recent unstructured pruning methods, such as SparseGPT (Frantar and Alistarh, 2023) and Wanda (Sun et al., 2023), prune individual weights ... Semi-structured methods address this by imposing fixed patterns (e.g., 2:4 sparsity (Fang et al., 2024; Zheng et al., 2024)), yet still struggle to support efficient training and require specialized hardware. ShortGPT (Men et al., 2024) introduce global or layer-wise pruning strategies, yet do not explicitly ... SliceGPT (Ashkboos et al., 2024) applies PCA-based transformations per block, but remains highly sensitive to calibration data, reflecting a broader ... Table 4.
Since most of the current LLMs are based on the SwiGLU (Shazeer, 2020) structure, we focus ... Neural Tangent Kernel (NTK) (Jacot et al., 2018) provides a kernel-based framework for analyzing ... See the details of the derivation in Section A.6. ... Consequently, popular practices include fixing the weights (i.e., setting ... In Llama3's implementation, which employs Grouped Query Attention (GQA), multiple query heads share ... Without loss of generality, our analysis can be extended to the vector-output case.


Playstyle and Artificial Intelligence: An Initial Blueprint Through the Lens of Video Games

Lin, Chiu-Chou

arXiv.org Artificial Intelligence

Contemporary artificial intelligence (AI) development largely centers on rational decision-making, valued for its measurability and suitability for objective evaluation. Yet in real-world contexts, an intelligent agent's decisions are shaped not only by logic but also by deeper influences such as beliefs, values, and preferences. The diversity of human decision-making styles emerges from these differences, highlighting that "style" is an essential but often overlooked dimension of intelligence. This dissertation introduces playstyle as an alternative lens for observing and analyzing the decision-making behavior of intelligent agents, and examines its foundational meaning and historical context from a philosophical perspective. By analyzing how beliefs and values drive intentions and actions, we construct a two-tier framework for style formation: the external interaction loop with the environment and the internal cognitive loop of deliberation. On this basis, we formalize style-related characteristics and propose measurable indicators such as style capacity, style popularity, and evolutionary dynamics. The study focuses on three core research directions: (1) Defining and measuring playstyle, proposing a general playstyle metric based on discretized state spaces, and extending it to quantify strategic diversity and competitive balance; (2) Expressing and generating playstyle, exploring how reinforcement learning and imitation learning can be used to train agents exhibiting specific stylistic tendencies, and introducing a novel approach for human-like style learning and modeling; and (3) Practical applications, analyzing the potential of these techniques in domains such as game design and interactive entertainment. Finally, the dissertation outlines future extensions, including the role of style as a core element in building artificial general intelligence (AGI).
By investigating stylistic variation, we aim to rethink autonomy, value expression, and even offer a tangible perspective on the ultimate philosophical question: What is the soul?


Contemplative Artificial Intelligence

Laukkonen, Ruben, Inglis, Fionn, Chandaria, Shamil, Sandved-Smith, Lars, Lopez-Sola, Edmundo, Hohwy, Jakob, Gold, Jonathan, Elwood, Adam

arXiv.org Artificial Intelligence

As artificial intelligence (AI) improves, traditional alignment strategies may falter in the face of unpredictable self-improvement, hidden subgoals, and the sheer complexity of intelligent systems. Inspired by contemplative wisdom traditions, we show how four axiomatic principles can instil a resilient Wise World Model in AI systems. First, mindfulness enables self-monitoring and recalibration of emergent subgoals. Second, emptiness forestalls dogmatic goal fixation and relaxes rigid priors. Third, non-duality dissolves adversarial self-other boundaries. Fourth, boundless care motivates the universal reduction of suffering. We find that prompting AI to reflect on these principles improves performance on the AILuminate Benchmark (d=.96) and boosts cooperation and joint-reward on the Prisoner's Dilemma task (d=7+). We offer detailed implementation strategies at the level of architectures, constitutions, and reinforcement on chain-of-thought. For future systems, active inference may offer the self-organizing and dynamic coupling capabilities needed to enact Contemplative AI in embodied agents.


Multi-Hazard Early Warning Systems for Agriculture with Featural-Temporal Explanations

Zheng, Boyuan, Chu, Victor W.

arXiv.org Artificial Intelligence

The situation is evolving due to climate change, and hence such systems should have the intelligence to continue learning from recent climate behaviours. However, traditional single-hazard forecasting methods fall short in capturing complex interactions among concurrent climatic events. To address this deficiency, in this paper, we combine sequential deep learning models and advanced Explainable Artificial Intelligence (XAI) techniques to introduce a multi-hazard forecasting framework for agriculture. In our experiments, we utilize meteorological data from four prominent agricultural regions in the United States (between 2010 and 2023) to validate the predictive accuracy of our framework on multiple severe event types, which are extreme cold, floods, frost, hail, heatwaves, and heavy rainfall, with tailored models for each area. The framework uniquely integrates attention mechanisms with TimeSHAP (a recurrent XAI explainer for time series) to provide comprehensive temporal explanations revealing not only which climatic features are influential but precisely when their impacts occur. Our results demonstrate strong predictive accuracy, particularly with the BiLSTM architecture, and highlight the system's capacity to inform nuanced, proactive risk management strategies.


Geological Inference from Textual Data using Word Embeddings

Linphrachaya, Nanmanas, Gómez-Méndez, Irving, Siripatana, Adil

arXiv.org Artificial Intelligence

This research explores the use of Natural Language Processing (NLP) techniques to locate geological resources, with a specific focus on industrial minerals. By using word embeddings trained with the GloVe model, we extract semantic relationships between target keywords and a corpus of geological texts. The text is filtered to retain only words with geographical significance, such as city names, which are then ranked by their cosine similarity to the target keyword. Dimensional reduction techniques, including Principal Component Analysis (PCA), Autoencoder, Variational Autoencoder (VAE), and VAE with Long Short-Term Memory (VAE-LSTM), are applied to enhance feature extraction and improve the accuracy of semantic relations. For benchmarking, we calculate the proximity between the ten cities most semantically related to the target keyword and identified mine locations using the haversine equation. The results demonstrate that combining NLP with dimensional reduction techniques provides meaningful insights into the spatial distribution of natural resources. Although the top-ranked cities fall in the same region as the identified mine locations, the accuracy has room for improvement.
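
The two measurements named in this abstract, cosine similarity for ranking place names against a target keyword and the haversine formula for the distance to a mine location, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `top_k` parameter, the toy vectors, and the mean Earth radius of 6371 km are assumptions chosen for illustration, and real inputs would be GloVe embeddings and actual coordinates.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_places(target_vec, place_vecs, top_k=10):
    """Return the top_k (name, similarity) pairs, most similar first.

    place_vecs is a hypothetical dict mapping a place name to its embedding.
    """
    scored = [(name, cosine_similarity(target_vec, vec))
              for name, vec in place_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


# Toy vectors standing in for embeddings (hypothetical values).
toy_places = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
top_two = rank_places([1.0, 0.0], toy_places, top_k=2)
```

With real data, each ranked city's coordinates would be passed to `haversine_km` together with a mine's coordinates to obtain the benchmark distance the abstract describes.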